TF Serving
How to Solve the Model Serving Component of the MLOps Stack - neptune.ai
Model serving and deployment is one of the pillars of the MLOps stack. In this article, I'll dive into it and talk about what basic, intermediate, and advanced setups for model serving look like. Let's start by covering some basics.

Training a machine learning model may seem like a great accomplishment, but in practice it's not even halfway to delivering business value. For a machine learning initiative to succeed, we need to deploy that model and ensure it meets our performance and reliability requirements. You may say, "But I can just pack it into a Docker image and be done with it." In some scenarios, that could indeed be enough. But when people talk about productionizing ML models, they use the term serving rather than simply deployment. So what does this mean?
Hosting Models with TF Serving on Docker
Training a Machine Learning (ML) model is only one step in the ML lifecycle. There's no purpose to ML if you cannot get responses from your model: you must be able to host your trained model for inference. There is a variety of hosting and deployment options for ML, one of the most popular being TensorFlow Serving. TensorFlow Serving takes your trained model's artifacts and hosts them for inference.
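To make this concrete, here is a minimal sketch of that workflow, assuming a Keras model, the stock tensorflow/serving Docker image, and the requests library on the client side; the model name my_model and the export path are hypothetical placeholders.

```python
import json

import requests
import tensorflow as tf

# A trivial stand-in for a trained model; any Keras model works the same way.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1),
])

# Export in the SavedModel format that TF Serving consumes.
# TF Serving expects a numeric version subdirectory (here: 1).
tf.saved_model.save(model, "/tmp/my_model/1")

# Start the server with the official image, e.g.:
#   docker run -p 8501:8501 \
#     --mount type=bind,source=/tmp/my_model,target=/models/my_model \
#     -e MODEL_NAME=my_model -t tensorflow/serving

# TF Serving's REST API listens on port 8501 by default.
payload = {"instances": [[1.0, 2.0, 3.0]]}
response = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",
    data=json.dumps(payload),
)
print(response.json())  # e.g. {"predictions": [[0.42]]}
```

The same SavedModel can also be queried over gRPC, which TF Serving exposes on port 8500 by default and which is typically the better choice for high-throughput, latency-sensitive workloads.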
Deploying Machine Learning Models in Practice
The talk will cover the various options available for the most popular and widely used ML libraries, including MLeap and TF Serving, as well as open standards such as PMML, PFA, and the recently announced ONNX format for Deep Learning. I will also introduce Aardpfark, which initially covers Spark ML pipelines, along with experimental work on exporting Spark ML pipelines to TensorFlow graphs for use with TF Serving.
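The abstract doesn't tie the ONNX mention to any particular tool, but as one illustration of what exporting a model to an open standard looks like, here is a sketch using PyTorch's built-in ONNX exporter together with the onnxruntime package; the network, tensor names, and shapes are hypothetical stand-ins, not the speaker's example.

```python
import numpy as np
import onnxruntime as ort
import torch
import torch.nn as nn

# A stand-in network; any torch.nn.Module with a defined forward() works.
model = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
model.eval()

# The exporter traces the model with a dummy input of the right shape
# and writes a framework-neutral ONNX graph to disk.
dummy = torch.randn(1, 3)
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["features"], output_names=["score"])

# Any ONNX-compatible runtime can now load and serve the graph,
# independently of the framework that trained it.
session = ort.InferenceSession("model.onnx")
features = np.random.randn(1, 3).astype(np.float32)
print(session.run(["score"], {"features": features}))
```

This framework-neutral handoff between training code and serving runtime is the same portability argument that motivates PMML and PFA.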